Abstract

We investigate the statistical properties of the correlation matrix between individual stocks
traded in the Korean stock market using the random matrix theory (RMT) and observe how
these properties affect the portfolio weights in the Markowitz portfolio theory. We find that the
distribution of the correlation coefficients is positively skewed and changes over time. We find that
the eigenvalue distribution of the original correlation matrix deviates from the eigenvalues predicted
by the RMT, and the largest eigenvalue is 52 times larger than the maximum value among the
eigenvalues predicted by the RMT. The $\beta_{473}$ coefficient, which reflects the largest eigenvalue
property, is 0.8, while that of one of the eigenvalues in the RMT range is approximately zero. Notably,
we show that the entropy function $E(\sigma)$ of the portfolio weights with the portfolio risk $\sigma$ for the
original and filtered correlation matrices is consistent with a power-law function,
$E(\sigma) \sim \sigma^{-\gamma}$, with the exponent $\gamma \approx 2.92$, and that the exponent decreases
significantly during the Asian currency crisis.

I. INTRODUCTION



Financial markets are known as representative complex systems, which are organized
by various unexpected phenomena arising from non-trivial interactions among
heterogeneous agents. The study of complex economic systems is not easy because we do
not know the control parameters that govern economic systems and cannot easily apply the
parameters we do know to economic systems. However, much research has been conducted
to understand the statistical properties of financial time series. In particular, the
analysis of financial data by various methods developed in statistical physics has become a
very interesting research area for physicists and economists [4]. There is practical [5-7] as
well as scientific value in analyzing the correlation coefficient between stock
return time series because it contains a significant amount of information on the nonlinear
interactions in the financial market and is an input parameter of the Markowitz portfolio
theory. The correlation matrix between stocks, which has unexpected properties due to complex
behaviors, such as temporal non-equilibrium, mispricing, bubbles, market crashes and
so on, is an important parameter for understanding the interactions in the financial market [8].
To analyze the correlation matrix, previous studies presented various statistical methods,
such as principal component analysis (PCA) [9], singular value decomposition (SVD) [10]
and factor analysis (FA) [11]. Here, to analyze the actual correlation matrix, we employ the
random matrix theory (RMT), which was introduced by Wigner, Dyson and Mehta [12-15].
It explains the statistical properties of energy levels in complex nuclei well. The RMT
method is useful for eliminating the randomness in the actual correlation matrix
[16-21]. Recently, Laloux et al. (1999) [22] and Plerou et al. (1999) [23] analyzed the
correlation matrix of financial time series by the RMT method. The authors found that 94% of the
eigenvalues of the correlation matrix can be predicted by the RMT, while the other 6% of the
eigenvalues deviate from the RMT. In addition, Plerou et al. (2002) [24] applied the RMT
method to the United States stock market and observed that the correlation matrix of the stock
market consists of random and non-random parts, with the non-random part carrying useful
information about the financial market. The eigenvectors that deviate from the RMT remain very
stable over the whole period. We investigate the various statistical properties of the correlation
matrix of 473 daily stock return time series traded in the Korean stock market from 3 January 1993
to 31 May 2003. We find that the distribution of the correlation coefficients is positively skewed
and changes over time. Using the RMT method, we show that the correlation
matrix contains meaningful information as well as a random component. Notably, we show that
for both the original, $C_{\mathrm{original}}$, and filtered, $C_{\mathrm{filter}}$, correlation matrices
the entropy function, $E(\sigma)$, of the portfolio weights with the portfolio risk, $\sigma$, is consistent
with a power-law function, $E(\sigma) \sim \sigma^{-\gamma}$, with an exponent $\gamma \approx 2.92$.
In the following section, we describe the data and methods used in this paper. In Section III,
we present the results. Finally, we end with a conclusion.



II. DATA AND METHOD 



In this paper, we investigate the statistical properties of the correlation matrix of the daily
returns of 473 stocks traded on the Korean stock market from 3 January 1993 to 31 May 2003.
The data, obtained from the Korea Stock Exchange, cover 2845 days. To understand the
non-trivial interactions, we calculated the correlation matrix between stocks for the whole period
as well as for sub-periods of 250 data points shifted by 21 days. We propose a verification
process to analyze the statistical properties of the correlation matrix between stock returns.
First, we estimate the statistical properties of the correlation matrices using the RMT
method. Second, we calculate the entropy of the portfolio weights using the Markowitz
portfolio theory. Before demonstrating the verification process, we introduce the RMT,
which was proposed by Wigner, Dyson, and Mehta, and the Markowitz portfolio theory
(MPT) [30], introduced by Markowitz in 1952. We created $N$ (the number of companies) data
sets with $L$ data points following iid(0, 1). Let the created data be denoted by the symbol
$G$. Here, $G$ is an ($N \times L$) matrix with random elements, and the correlation matrix
is defined by

$$C_{\mathrm{random}} = \frac{1}{L} G G^{T}, \qquad (1)$$

where $G^{T}$ is the transpose of $G$, and the correlation between elements is approximately zero.
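The construction of the random correlation matrix can be illustrated with a minimal NumPy sketch. It assumes that iid(0, 1) means independent standard normal entries; the function name `random_correlation_matrix`, the seed and the matrix sizes are hypothetical choices for this example, not the settings used in the paper.

```python
import numpy as np

def random_correlation_matrix(n_series, n_points, seed=0):
    """Return G G^T / L for an (N x L) matrix G of iid standard normal entries."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n_series, n_points))
    return g @ g.T / n_points

n_series = 50
c_random = random_correlation_matrix(n_series=n_series, n_points=2000)

# Off-diagonal entries fluctuate around zero, diagonal entries around one.
off_diagonal = c_random[~np.eye(n_series, dtype=bool)]
mean_off_diagonal = off_diagonal.mean()
mean_diagonal = np.diag(c_random).mean()
print(mean_off_diagonal, mean_diagonal)
```

For finite L the off-diagonal entries are not exactly zero; their spread shrinks roughly as one over the square root of L.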
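If $N \to \infty$ and $L \to \infty$, the eigenvalue spectrum of RMT is calculated by using

$$P_{\mathrm{RM}}(\lambda) = \frac{Q}{2\pi} \frac{\sqrt{(\lambda_{+} - \lambda)(\lambda - \lambda_{-})}}{\lambda}, \qquad (2)$$

where the eigenvalues $\lambda$ lie within $\lambda_{-} < \lambda < \lambda_{+}$, $Q = L/N$, and the maximum and minimum
eigenvalues of the RMT, $C_{\mathrm{random}}$, are given by

$$\lambda_{\pm} = 1 + \frac{1}{Q} \pm 2\sqrt{\frac{1}{Q}}. \qquad (3)$$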

If $L$ and $N$ are finite, the eigenvalue spectrum decreases gradually around the theoretical
largest eigenvalue predicted by the RMT instead of ending sharply.
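The RMT bounds and spectrum described above can be evaluated with a short standard-library sketch. It assumes unit-variance data and Q = L/N of at least one; the helper names `mp_bounds` and `mp_density` are hypothetical and introduced only for illustration.

```python
import math

def mp_bounds(n_series, n_points):
    """Return (lambda_minus, lambda_plus) of the RMT spectrum for Q = L / N."""
    q = n_points / n_series
    spread = 2.0 * math.sqrt(1.0 / q)
    return 1.0 + 1.0 / q - spread, 1.0 + 1.0 / q + spread

def mp_density(lam, n_series, n_points):
    """Eigenvalue density predicted by the RMT; zero outside the bounds."""
    q = n_points / n_series
    lam_min, lam_max = mp_bounds(n_series, n_points)
    if not lam_min < lam < lam_max:
        return 0.0
    return q / (2.0 * math.pi) * math.sqrt((lam_max - lam) * (lam - lam_min)) / lam

# Sizes quoted in the text: 473 stocks and 2845 daily observations.
lam_min, lam_max = mp_bounds(n_series=473, n_points=2845)
print(round(lam_min, 3), round(lam_max, 3))
```

For these sizes the bounds come out at roughly 0.35 and 1.98, so any empirical eigenvalue far above 2 lies outside the range that pure randomness would explain.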

We next explain the MPT, which is used to select the optimal portfolio among all stocks. The MPT
method, introduced by Markowitz in 1952, is generally known as the mean-variance theory.
The purpose of the MPT is to minimize the portfolio risk for a given portfolio return. The risk can
be quantified by the variance, defined as follows.

$$\Omega = \sum_{i=1}^{N} \sum_{j=1}^{N} \omega_i \omega_j C_{ij} \sigma_i \sigma_j, \qquad (4)$$

where $\omega_i$ is the portfolio weight of stock $i$, which can be calculated using two Lagrange
multipliers, $\sigma_i$ is the standard deviation of stock $i$, and $C_{ij}$ is the correlation coefficient
between stock $i$ and stock $j$. In this work, we use the no short-selling constraint for portfolio
weights [41], i.e. we assume that all the weights are non-negative numbers ($\omega_i \ge 0$,
$\forall i = 1, \ldots, N$). We also normalize the portfolio weights in such a way that
$\sum_{i=1}^{N} \omega_i = 1$. The portfolio return, $\mu$, is calculated by

$$\mu = \sum_{i=1}^{N} \omega_i \mu_i, \qquad (5)$$

where $\mu_i$ is the mean return of stock $i$. We next consider the portfolio weights because these
determine the efficient frontier of the portfolio. Since $\sum_{i=1}^{N} \omega_i = 1$, we use
Shannon's entropy to quantify the statistical properties of the portfolio weights, defined by

$$E = -\sum_{i=1}^{N} p_i \ln(p_i), \qquad (6)$$

where $p_i$ is the portfolio weight $\omega_i$.
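The three portfolio quantities defined above (variance, return and entropy of the weights) can be computed directly from their definitions. The sketch below uses only the standard library; the function names and the three-stock input values are hypothetical and serve only as an illustration. Weights equal to zero are skipped in the entropy, following the usual convention that 0 ln 0 = 0.

```python
import math

def portfolio_variance(weights, sigmas, corr):
    """Double sum of w_i * w_j * C_ij * sigma_i * sigma_j over all pairs."""
    n = len(weights)
    return sum(weights[i] * weights[j] * corr[i][j] * sigmas[i] * sigmas[j]
               for i in range(n) for j in range(n))

def portfolio_return(weights, means):
    """Weighted sum of the mean returns."""
    return sum(w * m for w, m in zip(weights, means))

def weight_entropy(weights):
    """Shannon entropy of long-only weights that sum to one."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)

# Hypothetical three-stock example.
weights = [0.5, 0.3, 0.2]
sigmas = [0.02, 0.03, 0.04]
means = [0.001, 0.002, 0.003]
corr = [[1.0, 0.2, 0.1],
        [0.2, 1.0, 0.3],
        [0.1, 0.3, 1.0]]

variance = portfolio_variance(weights, sigmas, corr)
expected_return = portfolio_return(weights, means)
entropy = weight_entropy(weights)
print(variance, expected_return, entropy)
```

The entropy is largest, ln N, for equal weights and zero when the whole portfolio sits in a single stock, so it measures how diversified the weights are.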

Using the eigenvalue distribution predicted by equation (2), we estimated the random
part of the original correlation matrix and, as in the previous paper [24], divided the matrix
into two parts as follows.

$$C_{\mathrm{original}} = C_{\mathrm{random}} + C_{\mathrm{filter}}. \qquad (7)$$
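One possible way to realize this decomposition is through the eigen-decomposition of the correlation matrix. The sketch below is not necessarily the procedure used here: it assumes that every eigenvalue at or below the upper RMT bound belongs to the random part and every eigenvalue above it to the filtered part. The function name `split_correlation` and the simulated one-factor returns are hypothetical.

```python
import numpy as np

def split_correlation(corr, n_points):
    """Split corr into (c_random, c_filter) using the upper RMT bound."""
    n_series = corr.shape[0]
    q = n_points / n_series
    lam_plus = 1.0 + 1.0 / q + 2.0 * np.sqrt(1.0 / q)
    eigvals, eigvecs = np.linalg.eigh(corr)  # ascending eigenvalues, orthonormal columns
    in_rmt_range = eigvals <= lam_plus       # assumption: these modes are treated as random

    def rebuild(mask):
        # Sum of lambda_k * v_k v_k^T over the selected eigenmodes.
        return (eigvecs[:, mask] * eigvals[mask]) @ eigvecs[:, mask].T

    return rebuild(in_rmt_range), rebuild(~in_rmt_range)

# Simulated returns: a common market factor plus independent noise.
rng = np.random.default_rng(0)
n_series, n_points = 20, 500
market = rng.standard_normal(n_points)
returns = 0.5 * market + rng.standard_normal((n_series, n_points))

c_original = np.corrcoef(returns)
c_random, c_filter = split_correlation(c_original, n_points)
```

Because the eigenvectors form a complete orthonormal basis, the two parts add up to the original matrix regardless of where the threshold is placed.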
Based on how many random elements exist in the correlation matrix, we analyzed the
non-trivial interactions between stocks. In addition, to estimate the eigenvalue properties,
we created data sets by using the elements of each eigenvector,

$$R(t) = \sum_{i=1}^{N} v_i r_i(t), \qquad (8)$$

where $r_i(t)$ is the return of the $i$th stock at time $t$, and $v_i$ is the $i$th element of the
eigenvector. To observe the eigenvalue properties separated by the RMT method, we created the
data sets $R_{\mathrm{Random}}(t)$ and $R_{\mathrm{Largest}}(t)$, reflecting the eigenvalue properties of
$C_{\mathrm{random}}$ and $C_{\mathrm{filter}}$, respectively. Using the one-factor model, widely
acknowledged in the financial literature as a pricing model, we then calculated the relationship
between the created time series and the market factor, which influences all stocks in the market.
The model is defined by

$$r_i(t) = \alpha_i + \beta_i R_{\mathrm{Market}}(t) + \epsilon_i(t), \qquad (9)$$

where $R_{\mathrm{Market}}$ is the KOSPI market index, and $\alpha_i$ and $\beta_i$ are the regression
coefficients of stock $i$. We use the $\beta$ coefficient to quantify the relationship between the
created data sets and the market index.
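The two steps above, building a time series from eigenvector elements and regressing it on the market, can be sketched with the standard library. The sketch assumes an ordinary least-squares fit of the one-factor model; the function names and the toy numbers are hypothetical and chosen so that the result is easy to check by hand.

```python
def eigen_series(returns, eigenvector):
    """R(t) = sum_i v_i * r_i(t), with returns[i][t] the return of stock i at time t."""
    n_points = len(returns[0])
    return [sum(v * r[t] for v, r in zip(eigenvector, returns))
            for t in range(n_points)]

def one_factor_fit(series, market):
    """Least-squares alpha and beta of series = alpha + beta * market + noise."""
    m_bar = sum(market) / len(market)
    s_bar = sum(series) / len(series)
    cov = sum((m - m_bar) * (s - s_bar) for m, s in zip(market, series))
    var = sum((m - m_bar) ** 2 for m in market)
    beta = cov / var
    alpha = s_bar - beta * m_bar
    return alpha, beta

# Toy data: two stocks that move exactly with the market, with slopes 0.8 and 1.2.
market = [0.010, -0.020, 0.015, 0.000, -0.005]
returns = [[0.8 * m for m in market],
           [1.2 * m for m in market]]
eigenvector = [0.5, 0.5]  # hypothetical eigenvector elements

series = eigen_series(returns, eigenvector)
alpha, beta = one_factor_fit(series, market)
print(alpha, beta)
```

In this toy case the combined series equals the market series, so the fitted slope is one and the intercept is zero up to rounding.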



III. RESULTS 



In this section, we analyze the various statistical features of the correlation matrix of 473
daily stock returns listed on the Korean stock market from 3 January 1993 to 31 May 2003
using the random matrix theory and the Markowitz portfolio theory. We present the results
on the statistical properties of the correlation matrix, such as its distribution, eigenvalue
spectrum and the entropy of the portfolio weights calculated by the MPT. Fig. 1(a) and (b) show
the distributions of the correlation coefficients of the original and random data sets, respectively.
Fig. 1(c) shows the distribution of the correlation matrices calculated for windows of 250 data
points shifted by 21 days. Fig. 1(d) displays the average value of the correlation matrices of
Fig. 1(c). In Fig. 1(a), we find that the distribution of the correlation coefficients between stocks
for the whole period is positively skewed and differs significantly from that for the random
interaction in Fig. 1(b). In Fig. 1(c), we show that the distribution of the correlation matrix changes
considerably over time. In particular, Fig. 1(d) shows that, during the Asian currency crisis,
the mean value of the correlation coefficients increased significantly. In other words, the
dynamic changes were caused by the complex behavior of the market crash, unlike the
case of random interactions. Our findings confirm that the interactions in the
Korean stock market deviate from those of random interactions.
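The sliding-window analysis behind Fig. 1(c) and (d) can be sketched as follows. The window length of 250 points and the shift of 21 days are taken from the text; the function name `rolling_mean_correlation` and the simulated one-factor returns are hypothetical stand-ins for the real data.

```python
import numpy as np

def rolling_mean_correlation(returns, window=250, step=21):
    """Mean off-diagonal correlation for each window of an (N x T) return array."""
    n_series, n_points = returns.shape
    off_diagonal = ~np.eye(n_series, dtype=bool)
    means = []
    for start in range(0, n_points - window + 1, step):
        corr = np.corrcoef(returns[:, start:start + window])  # rows are stocks
        means.append(corr[off_diagonal].mean())
    return np.array(means)

# Simulated stand-in data: 10 series, 2845 days, one common factor.
rng = np.random.default_rng(2)
market = rng.standard_normal(2845)
returns = market + rng.standard_normal((10, 2845))

mean_corr = rolling_mean_correlation(returns)
print(len(mean_corr), mean_corr.min(), mean_corr.max())
```

With 2845 observations, a 250-point window and a 21-day shift, this procedure yields 124 windows. A crisis period would show up as a run of windows with an elevated mean correlation.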

We next decompose the original correlation matrix into the random, $C_{\mathrm{random}}$, and filtered,
$C_{\mathrm{filter}}$, parts using the RMT method to extract the meaningful information from the original
correlation matrix. Fig. 2 shows the eigenvalue distribution of the correlation matrix in the
Korean stock market. In Fig. 2, the solid line (orange) is the eigenvalue spectrum predicted
by the RMT, and the red circles and blue circles indicate the eigenvalue distributions of the
original time series and random data sets, respectively. We find that the eigenvalue
distribution predicted by the RMT is very similar to that of the random data, while that of
the real time series behaves very differently. Moreover, the largest eigenvalue
is 52 times larger than the largest eigenvalue of the RMT. This is greater than the
25 times observed in the United States stock market [24].
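The comparison between the empirical spectrum and the RMT prediction can be sketched on simulated data. The example below does not reproduce the numbers reported here: it assumes a simple one-factor market with hypothetical sizes, and the function name `largest_eigenvalue_ratio` is introduced only for illustration.

```python
import numpy as np

def largest_eigenvalue_ratio(corr, n_points):
    """Return (ratio of largest eigenvalue to lambda_plus, fraction of eigenvalues <= lambda_plus)."""
    n_series = corr.shape[0]
    q = n_points / n_series
    lam_plus = 1.0 + 1.0 / q + 2.0 * np.sqrt(1.0 / q)
    eigvals = np.linalg.eigvalsh(corr)  # ascending order
    return eigvals[-1] / lam_plus, float(np.mean(eigvals <= lam_plus))

# Simulated one-factor market: every stock loads on the same factor.
rng = np.random.default_rng(1)
n_series, n_points = 100, 1000
market = rng.standard_normal(n_points)
returns = 0.5 * market + rng.standard_normal((n_series, n_points))

ratio, fraction_inside = largest_eigenvalue_ratio(np.corrcoef(returns), n_points)
print(ratio, fraction_inside)
```

A common factor shared by all stocks pushes one eigenvalue far above the RMT bound while almost all remaining eigenvalues stay below it, which is the qualitative pattern described for Fig. 2.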

To characterize the statistical properties of each eigenvalue, we created the return time
series using equation (8) and calculated the slopes $\beta$ between those series and the KOSPI market
index using equation (9). Fig. 3(a) and (b) show the distributions of the eigenvector elements
corresponding to the largest eigenvalue, $\lambda_{473}$, and to $\lambda_{100}$, one of the eigenvalues in the
RMT range, respectively. Fig. 3(c) and (d) show the $\beta$ coefficient between the KOSPI market index and
the created time series. We find that $\beta_{473}$ between the market index and the time series is 0.8,
while that of the time series created using the eigenvector elements of an eigenvalue in the RMT range is
approximately zero. We argue that the largest eigenvalue explains the market properties
well, whereas an eigenvalue in the range predicted by the RMT is uncorrelated with the market index.
We also decomposed the original correlation matrix according to the eigenvalues separated by
the RMT method. Fig. 4 shows the distributions of the various correlation matrices. The red
circles, blue diamonds, black squares and pink triangles indicate the correlation matrices
of the original, random, filter and largest eigenvalues, respectively. From the above
findings, we expect the distribution of the random correlation matrix $C_{\mathrm{random}}$ to follow
a Gaussian distribution, while the correlation matrix $C_{\mathrm{filter}}$, estimated after removing the
random components from the original correlation matrix by the RMT method, has a distribution
similar to that of the original time series. We found that the correlation matrix reflecting the
largest eigenvalue property differs clearly from that of the original time series.

To apply the RMT method to a portfolio optimization problem, we analyzed the portfolio
weights estimated by the MPT with various correlation matrices. The important parameters
are the return, $\mu_i$, standard deviation, $\sigma_i$, and correlation matrix, $C_{ij}$, of the original
stock returns, which are needed to calculate the portfolio weights of each stock. To assess
the effect of the correlation matrix among the input parameters, we apply the correlation
matrices $C_{\mathrm{filter}}$ and $C_{\mathrm{random}}$ separated by the RMT method. Fig. 5(a) shows the efficient
frontier lines created using the various correlation matrices, $C_{\mathrm{original}}$, $C_{\mathrm{random}}$, and
$C_{\mathrm{filter}}$. In Fig. 5(a), we find that the efficient frontier lines calculated with the
original, $C_{\mathrm{original}}$, and filtered, $C_{\mathrm{filter}}$, correlation matrices show very similar
behavior, while that of the random correlation matrix $C_{\mathrm{random}}$ differs significantly from that
of the original correlation matrix. In addition, at a given portfolio risk $\sigma$, the efficient frontier
line of the random correlation matrix $C_{\mathrm{random}}$ overestimates the portfolio return, $\mu$, relative
to that of the original correlation matrix. We next calculated the entropy of the portfolio
weights with each correlation matrix, $C_{\mathrm{original}}$, $C_{\mathrm{filter}}$ and $C_{\mathrm{random}}$. Fig. 5(b)
shows the relationship between the portfolio risk, $\sigma$, and the entropy of the portfolio weights for
the three types of correlation matrices on a log-log plot. We find that the entropy $E(\sigma)$
for both the original and filtered correlation matrices is approximately consistent with a
power-law function, $E(\sigma) \sim \sigma^{-\gamma}$, with the exponent $\gamma \approx 2.92$, while there is no
power-law relationship between the entropy and the portfolio return, $\mu$, as presented
in Fig. 5(c). We also calculated the exponents for sub-periods of 500 data points shifted by
20 days to verify the stability over time of the result observed in Fig. 5. We find that,
while the relationship between the entropy of the portfolio weights and the portfolio risk follows a
power-law function in each sub-period, the exponent $\gamma$ changes over
time and lies within $1.19 < \gamma < 3.23$. In particular, the $\gamma$ value calculated during the Asian
currency crisis decreases significantly.
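A power-law exponent of this kind is commonly estimated as the negative slope of a straight-line fit on a log-log plot. The sketch below assumes an ordinary least-squares fit, which may differ from the fitting procedure actually used; the function name `fit_power_law_exponent` and the synthetic data points are hypothetical.

```python
import math

def fit_power_law_exponent(risks, entropies):
    """Estimate gamma in E ~ sigma^(-gamma) from a least-squares line in log-log space."""
    xs = [math.log(s) for s in risks]
    ys = [math.log(e) for e in entropies]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return -slope

# Synthetic points that follow an exact power law with exponent 2.92.
risks = [0.01 * k for k in range(1, 11)]
entropies = [2e-5 * s ** -2.92 for s in risks]

gamma = fit_power_law_exponent(risks, entropies)
print(gamma)
```

Applied to each sub-period in turn, the same fit would give a time series of exponents that can be compared across calm and crisis periods.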



IV. CONCLUSIONS 



We investigated the statistical properties of the correlation matrix between the return
time series of individual stocks traded in the Korean stock market using the RMT method
and observed the effect of the correlation matrix applied to the Markowitz portfolio theory.
We found that the distribution of the correlation coefficients between stocks showed a positive
skew and changed dynamically over time. We found that the eigenvalue distribution of the
correlation matrix deviated from that of the RMT, and the largest eigenvalue was 52 times
larger than the largest eigenvalue predicted by the RMT. The slope $\beta$ between the market index and
the time series corresponding to the largest eigenvalue was 0.8, while that for an eigenvalue in the
RMT range was approximately zero. Notably, we found that the entropy function $E(\sigma)$ of the portfolio
weights with the portfolio risk $\sigma$ was consistent with a power-law function,
$E(\sigma) \sim \sigma^{-\gamma}$, with the exponent $\gamma \approx 2.92$, while the relationship between the
entropy and the portfolio return $\mu$ was not a power-law function. We also found that, while the
relationship between the entropy of the portfolio weights and the portfolio risk followed a
power-law function in all sub-periods, the exponent changed over time and lay within
$1.19 < \gamma < 3.23$. In particular, the exponent $\gamma$ decreased significantly during the Asian
currency crisis. In the next
step, we must rigorously study the portfolio weights of other stock markets because these
play an important role in terms of the portfolio risk and return.

FIG. 1: (a) and (b) show the distributions of the correlation coefficients between the stocks of 473
companies taken from the Korean stock market and between random data, respectively. (c) displays the
distribution of the correlation matrices of the sub-periods of 250 data points shifted by 21 days,
and (d) shows the average value of each correlation matrix in (c).

FIG. 2: The distribution of the eigenvalues of the correlation matrix estimated using the 473 companies
listed on the Korean stock market, of random data following the iid(0,1) process, and that predicted
by the RMT method. The red circles, blue circles, and pink solid line indicate the original time
series, random data, and theoretical line, respectively.

FIG. 3: (a) and (b) show the distributions of the eigenvector elements corresponding to $\lambda_{100}$ and
$\lambda_{473}$, respectively. (c) and (d) display the $\beta$ coefficients between the normalized market index
and the time series created by equation (9) for the eigenvalues $\lambda_{100}$ and $\lambda_{473}$. The values of
$\beta_{100}$ and $\beta_{473}$ are zero and 0.8, respectively.



FIG. 4: The distribution of the original correlation matrix, $C_{\mathrm{original}}$, and those created by the
random matrix theory, $C_{\mathrm{random}}$, $C_{\mathrm{filter}}$, and $C_{\mathrm{largest}}$. The red circles, blue
diamonds, black squares and pink triangles indicate the correlation matrices corresponding to the
original, random, filter and largest eigenvalue, respectively.

FIG. 5: (a) shows the efficient portfolio frontier lines for the original, $C_{\mathrm{original}}$, random,
$C_{\mathrm{random}}$, and filter, $C_{\mathrm{filter}}$, correlation matrices. (b) displays the relationship between
the entropy of the portfolio weights and the portfolio risk. (c) displays the relationship between the
entropy and the portfolio return.